Abstract

The object of this paper is the identification of Hammerstein systems,
which are dynamic systems consisting of a static nonlinearity and a linear
time-invariant dynamic system in cascade. We assume that the nonlinear
function can be described as a linear combination of p basis functions. We
model the system dynamics by means of an np-dimensional vector. This
vector, usually referred to as the overparameterized vector, contains all the
combinations between the nonlinearity coefficients and the first n samples
of the impulse response of the linear block. The estimation of the
overparameterized vector is performed with a new regularized kernel-based
approach. To this end, we introduce a novel kernel tailored for
overparameterized models, which yields estimates that can be uniquely
decomposed as the combination of an impulse response and p coefficients of the
static nonlinearity. As part of the work, we establish a clear connection
between the proposed identification scheme and our recently developed
nonparametric method based on the stable spline kernel.


1 Introduction 

A nonlinear system is usually called a Hammerstein system when it is composed
of two blocks in cascade, the first being a static nonlinearity and the second a
linear time-invariant (LTI) dynamic system [16].

There are several areas in science and engineering where Hammerstein systems
find application. For this reason, in recent years Hammerstein system
identification has become a popular and rather active research topic.

Several approaches have been proposed for Hammerstein system identification.
These include kernel-based regression methods, identification approaches based
on stochastic approximation, subspace methods, and iterative methods based on
least-squares.

An interesting approach was proposed by Er-Wei Bai. Here, the static
nonlinearity is modeled as the linear combination of p basis functions, while the
LTI system is assumed to be a finite impulse response (FIR) with n coefficients.
The Hammerstein system is then modeled as a linear regression, where the
parameter vector is np-dimensional. Since it contains all the combinations of the
nonlinearity coefficients and the FIR coefficients, this vector is usually called
the overparameterized vector. Its estimate is obtained via least-squares and then it
is decomposed in order to obtain the nonlinearity coefficients and the impulse
response. Albeit proven to be asymptotically consistent, the whole procedure
suffers from two main drawbacks. First, since it relies on a least-squares estimation
of a possibly very high-dimensional vector, the final estimates may suffer from
high variance. Second, the procedure does not guarantee that the estimated
np-dimensional vector can be exactly decomposed to obtain the nonlinearity
coefficients and the FIR system, and thus approximations are required.

In this paper, we propose a regularization technique to curb the variance
of the estimates of the overparameterized vector. Similarly to Bai's approach, we model
the Hammerstein system dynamics using the aforementioned overparameterized
vector, then we solve the regression problem relying on a kernel-based approach.
To this end, we introduce a novel kernel, called the Kronecker overparameterized
(KOP) kernel, which is the composition of a rank-one positive semi-definite
matrix and the so-called first-order stable spline kernel (see [22], [21]
for details). The structure of this kernel depends on a few parameters (also
called hyperparameters in this context), which we need to estimate from data.
This task is addressed by an empirical Bayes approach [18], that is to say,
by maximizing the marginal likelihood (ML) of the output. Once the kernel
parameters are fixed, the overparameterized vector is estimated via regularized
least squares. Equivalently, we can think of the overparameterized vector as
a Gaussian random vector with zero mean and covariance matrix given by the
KOP kernel. With this interpretation, the estimate corresponds to the minimum
mean square error estimate in the Bayesian sense.

A contribution of this paper is to reveal some interesting properties of the
estimated overparameterized vector provided by the proposed method. We prove
that, as opposed to Bai's method, this estimate can be decomposed exactly in order to
obtain the nonlinearity coefficients and the LTI system impulse response, with
no loss of information due to approximations. The concept of exact
decomposition will be made clear throughout the paper. We also demonstrate strong
connections with our recently proposed method [25], effectively proving that,
although the two approaches are inherently different, the estimates obtained with
the two methods are equivalent. Finally, we show, through several numerical
experiments, that the proposed method outperforms both the algorithm
proposed by Bai and the standard MATLAB system identification toolbox function
for Hammerstein system identification.

The paper is organized as follows. In the next section, we formulate the
Hammerstein system identification problem. In Section 3 we describe the
modeling approach based on overparameterized vectors. In Section 4 we introduce
the proposed identification scheme, and we give some theoretical background in
Section 5. Numerical experiments are illustrated in Section 6. Some conclusions
end the paper.

2 Problem formulation 

We consider a stable single input single output discrete-time system described
by the following time-domain relations (see Figure 1):

    $$ \begin{aligned}
       w_t &= f(u_t), \\
       y_t &= \sum_{k=1}^{\infty} g_k w_{t-k} + e_t.
       \end{aligned} $$                                                   (1)

In the above equation, $f(\cdot)$ represents a (static) nonlinear function transforming
the measurable input $u_t$ into the unavailable signal $w_t$, which in turn feeds a
strictly causal stable LTI system, described by the impulse response $g_t$. The
output measurements of the system $y_t$ are corrupted by white Gaussian noise,
denoted by $e_t$, which has unknown variance $\sigma^2$. Following a standard approach
in Hammerstein system identification, we assume that $f(\cdot)$ can be
modeled as a combination of $p$ known basis functions $\{\phi_i\}_{i=1}^{p}$, namely

    $$ w_t = f(u_t) = \sum_{i=1}^{p} c_i \phi_i(u_t), $$                  (2)

where the coefficients $c_i$ are unknown.
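As a minimal sketch of the model (1)-(2), the following Python/NumPy snippet simulates a Hammerstein system with a finite impulse response and zero initial conditions. The function name `simulate_hammerstein`, the monomial basis and the numerical values are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def simulate_hammerstein(u, c, g, basis, noise_std=0.0, rng=None):
    """Simulate y_t = sum_{k=1}^{n} g_k w_{t-k} + e_t, w_t = sum_i c_i phi_i(u_t).

    u     : inputs u_0, ..., u_{N-1}
    c     : p nonlinearity coefficients
    g     : n impulse response samples g_1, ..., g_n (FIR truncation)
    basis : list of p callables phi_i (hypothetical choice of basis)
    Returns (w, y), where y holds y_1, ..., y_N under zero initial conditions.
    """
    u = np.asarray(u, dtype=float)
    # Static nonlinearity, equation (2).
    w = sum(ci * phi(u) for ci, phi in zip(c, basis))
    # Strictly causal filtering: with w[0] = w_0 and y[0] = y_1 = g_1 w_0,
    # the output is the truncated convolution of w and g.
    y = np.convolve(w, g)[: len(u)]
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        y = y + noise_std * rng.standard_normal(len(u))
    return w, y

# Illustrative values (not from the paper).
basis = [lambda x: x, lambda x: x**2]
c = np.array([1.0, 0.5])
g = np.array([0.8, 0.4, 0.2])
u = np.array([1.0, 2.0, 0.0, -1.0])
w, y = simulate_hammerstein(u, c, g, basis)
```

Calling the function with `noise_std > 0` adds white Gaussian measurement noise; the noise-free call above makes the intermediate signal `w` and the output `y` easy to check by hand.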



Figure 1: Block scheme of the Hammerstein system. 


We assume that $N$ input-output samples are collected, and denote them
by $\{u_t\}_{t=0}^{N-1}$, $\{y_t\}_{t=1}^{N}$. For notational convenience, we also assume null initial
conditions. Then, the system identification problem we discuss in this paper is
the problem of estimating $n$ samples of the impulse response, say $\{g_t\}_{t=1}^{n}$ (where
$n$ is large enough to capture the system dynamics), as well as the $p$ coefficients
$\{c_i\}_{i=1}^{p}$ characterizing the static nonlinearity $f(\cdot)$.

2.1 Non-uniqueness of the identified system 

It is well-known that the two components of a Hammerstein system
can be determined only up to a scaling factor. In fact, for any nonzero $\alpha \in \mathbb{R}$, every pair
$(\alpha g_t, \frac{1}{\alpha} f(\cdot))$ describes the input-output relation equally well. As suggested
in the literature, we will circumvent this non-uniqueness issue by introducing the following
assumption:

Assumption 1. The impulse response has unitary $\ell_2$ gain, i.e. $\|g\|_2 = 1$, and
the sign of its first non-zero element is positive.


2.2 Notation and preliminaries 


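Given a sequence of scalars $\{a_i\}_{i=1}^{m}$, we denote by $a$ its vector representation,
i.e.

    $$ a = \begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix} \in \mathbb{R}^{m}. $$

We reserve the symbol $\otimes$ to indicate the Kronecker product of two matrices
(or vectors). We will make use of the bilinear property

    $$ (A \otimes B)(C \otimes D) = AC \otimes BD, $$

where $A$, $B$, $C$, and $D$ have proper dimensions. Denoting by $\mathrm{vec}(A)$ the
columnwise vectorization of a matrix $A$, we recall that, for any two vectors $a$ and $b$,
$\mathrm{vec}(a b^T) = b \otimes a$. Given a vector $a \in \mathbb{R}^{np}$, we introduce its $n \times p$ reshape as

    $$ R_{n,p}(a) = \begin{bmatrix}
         a_1    & \cdots & a_{n(p-1)+1} \\
         \vdots & \ddots & \vdots       \\
         a_n    & \cdots & a_{np}
       \end{bmatrix} \in \mathbb{R}^{n \times p}. $$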


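Given $a \in \mathbb{R}^{m}$, the symbol $T_n(a)$ denotes the $m \times n$ Toeplitz matrix whose
entries are elements of $a$, namely

    $$ T_n(a) = \begin{bmatrix}
         a_1     & 0       & \cdots & 0         \\
         a_2     & a_1     & \ddots & \vdots    \\
         \vdots  & \vdots  & \ddots & 0         \\
         a_{m-1} & a_{m-2} & \cdots & a_{m-n}   \\
         a_m     & a_{m-1} & \cdots & a_{m-n+1}
       \end{bmatrix} \in \mathbb{R}^{m \times n}. $$                      (3)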


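Let

    $$ S = \begin{bmatrix} 0 & 0 \\ I_{m-1} & 0 \end{bmatrix} \in \mathbb{R}^{m \times m} $$      (4)

and

    $$ P = \begin{bmatrix} I & S & \cdots & S^{n-1} \end{bmatrix} \in \mathbb{R}^{m \times nm}. $$   (5)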


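We have the following result, which will be used throughout the paper.

Lemma 1. Let $a \in \mathbb{R}^{m}$ and $T_n(a)$ be as in (3). Then

    $$ P (I_n \otimes a) = T_n(a). $$                                     (6)

Proof. Note that

    $$ \begin{aligned}
       T_n(a) &= \begin{bmatrix} a & Sa & S^2 a & \cdots & S^{n-1} a \end{bmatrix} \\
              &= \begin{bmatrix} I & S & S^2 & \cdots & S^{n-1} \end{bmatrix}
                 \begin{bmatrix} a & & \\ & \ddots & \\ & & a \end{bmatrix} \\
              &= P (I \otimes a),
       \end{aligned} $$                                                   (7)

which proves the statement. □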

Based on the equality stated by Lemma 1, we extend the Toeplitz notation
to matrices, that is, given $A \in \mathbb{R}^{m \times p}$ we write

    $$ T_n(A) = P (I_n \otimes A) \in \mathbb{R}^{m \times np}. $$        (8)
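Lemma 1 is easy to check numerically. The sketch below builds the shift matrix of (4), the matrix $P$ of (5) and the Toeplitz matrix of (3) with NumPy, and compares the two sides of (6). The helper names `shift_matrix`, `P_matrix` and `toeplitz_T` are our own and are only meant as an illustration.

```python
import numpy as np

def shift_matrix(m):
    """Down-shift matrix S of (4): (S a)_1 = 0 and (S a)_i = a_{i-1}."""
    return np.eye(m, k=-1)

def P_matrix(m, n):
    """P = [I S S^2 ... S^{n-1}] of (5), with size m x nm."""
    S = shift_matrix(m)
    return np.hstack([np.linalg.matrix_power(S, k) for k in range(n)])

def toeplitz_T(a, n):
    """m x n lower-triangular Toeplitz matrix T_n(a) of (3)."""
    a = np.asarray(a, dtype=float)
    m = len(a)
    T = np.zeros((m, n))
    for j in range(min(n, m)):
        T[j:, j] = a[: m - j]      # column j is a shifted down by j places
    return T

m, n = 5, 3
a = np.arange(1.0, m + 1)                              # a = [1, 2, 3, 4, 5]
lhs = P_matrix(m, n) @ np.kron(np.eye(n), a.reshape(-1, 1))   # P (I_n kron a)
rhs = toeplitz_T(a, n)
lemma1_holds = np.allclose(lhs, rhs)
```

Replacing the column `a.reshape(-1, 1)` with an $m \times p$ matrix gives the matrix extension (8).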


3 Identification via overparameterized models 


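In this paper, we deal with overparameterized approaches to Hammerstein system
identification. To this end, we construct the matrices

    $$ F = \begin{bmatrix}
         \phi_1(u_0)     & \cdots & \phi_p(u_0)     \\
         \vdots          & \ddots & \vdots          \\
         \phi_1(u_{N-1}) & \cdots & \phi_p(u_{N-1})
       \end{bmatrix} \in \mathbb{R}^{N \times p} $$                       (9)

and

    $$ \Phi \triangleq T_n(F) \in \mathbb{R}^{N \times np}. $$            (10)

Then, we can express the Hammerstein system dynamics by means of
the linear regression model

    $$ y = \Phi \theta + e, $$                                            (11)

where

    $$ \theta = g \otimes c \in \mathbb{R}^{np}. $$                       (12)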


This vector contains the $n+p$ unknown parameters of the Hammerstein model.
Thus, it constitutes an overparameterization with respect to the original
parameters $c$ and $g$. A desirable property of any estimate of $\theta$ is that it should be
expressible as in (12), namely as a Kronecker product of two vectors. We formalize
this concept in the following definition.


Definition 1. Let $\theta \in \mathbb{R}^{np}$. We say that $\theta$ is a Kronecker overparameterized
(KOP) vector if there exist $g \in \mathbb{R}^{n}$ and $c \in \mathbb{R}^{p}$ such that (12) holds.

The following lemma gives a property of KOP vectors.

Lemma 2. Let $\theta \in \mathbb{R}^{np}$ be a KOP vector. Then $R_{n,p}(\theta) = c g^T$ and thus
$R_{n,p}(\theta)$ is a rank-one matrix.

Proof. Follows from the identity $\mathrm{vec}(c g^T) = g \otimes c$. □

Under Assumption 1, an $np$-dimensional KOP vector $\theta$ can be uniquely
decomposed into the $n$- and $p$-dimensional vectors $g$ and $c$.


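Proposition 1. Let $\theta \in \mathbb{R}^{np}$ be a KOP vector. Let $\bar g$ be the $i$th row and $\bar c$
the $j$th column of $R_{n,p}(\theta)$, and define

    $$ g = \frac{\bar g}{\|\bar g\|} \, \mathrm{sign}(\bar g_1), \qquad
       c = \frac{\bar c}{\bar g_j} \, \|\bar g\| \, \mathrm{sign}(\bar g_1). $$      (13)

Then $\theta = g \otimes c$, $\|g\| = 1$ and $g_1 > 0$.

Proof. From (13) we have that $c_k g_l = \bar c_k \bar g_l / \bar g_j$. Since $R_{n,p}(\theta)$ is
rank-one (Lemma 2), this is the $(k,l)$th element of $R_{n,p}(\theta)$,
hence $R_{n,p}(\theta) = c g^T$. In addition,

    $$ \|g\| = \left\| \frac{\bar g}{\|\bar g\|} \, \mathrm{sign}(\bar g_1) \right\|
            = \frac{\|\bar g\|}{\|\bar g\|} \, |\mathrm{sign}(\bar g_1)| = 1, $$     (14)

and

    $$ g_1 = \frac{\bar g_1}{\|\bar g\|} \, \mathrm{sign}(\bar g_1)
           = \frac{|\bar g_1|}{\|\bar g\|} > 0, $$                        (15)

which completes the proof. □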
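The decomposition (13) can be sketched in a few lines of NumPy. We assume here that the reshape is taken column-wise into a $p \times n$ matrix, since this is the orientation for which $\mathrm{vec}(c g^T) = g \otimes c$ maps a KOP vector to $c g^T$; the function names and the test vectors are illustrative.

```python
import numpy as np

def reshape_kop(theta, n, p):
    """Column-wise reshape of theta (length n*p) into a p x n matrix.

    Assumption: p x n is the orientation for which vec(c g^T) = g kron c,
    so that theta = g kron c is mapped to the rank-one matrix c g^T.
    """
    return np.asarray(theta, dtype=float).reshape(p, n, order="F")

def decompose_kop(theta, n, p, i=0, j=0):
    """Recover (g, c) from a KOP vector following (13).

    Indices are 0-based here. Row i and column j must meet at a non-zero
    entry, and the first element of the chosen row must be non-zero.
    """
    R = reshape_kop(theta, n, p)
    g_bar = R[i, :]                 # i-th row, proportional to g
    c_bar = R[:, j]                 # j-th column, proportional to c
    sgn = np.sign(g_bar[0])
    norm = np.linalg.norm(g_bar)
    g = g_bar / norm * sgn
    c = c_bar / g_bar[j] * norm * sgn
    return g, c

g_true = np.array([0.6, -0.8, 0.0])      # unit norm, first element positive
c_true = np.array([2.0, -1.0])
theta = np.kron(g_true, c_true)
g_hat, c_hat = decompose_kop(theta, n=3, p=2)
```

Any admissible choice of row and column returns the same pair, in agreement with the uniqueness granted by Assumption 1.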


3.1 A review of an overparameterized method for Hammerstein system identification

In this section we review the identification procedure proposed by Bai, which
constitutes the starting point of our regularized kernel-based method. Given the
model (11), consistent estimates of $c$ and $g$ can be obtained with the following
steps. First, we compute the least-squares
estimate

    $$ \hat\theta^{LS} = (\Phi^T \Phi)^{-1} \Phi^T y. $$                  (16)

Then, since we know that $\theta$ is a KOP vector, that is, the reshaping of $\theta$ into
an $n \times p$ matrix must be rank-one (Lemma 2), we approximate $\hat\theta^{LS}$ to a KOP
vector by approximating $R_{n,p}(\hat\theta^{LS})$ to a rank-one matrix. This can be done by
solving the problem

    $$ \begin{aligned}
       \underset{X}{\text{minimize}} \quad & \| X - R_{n,p}(\hat\theta^{LS}) \|_F \\
       \text{s.t.} \quad & \operatorname{rank} X = 1,
       \end{aligned} $$                                                   (17)

where $\|\cdot\|_F$ denotes the Frobenius norm. Expressing $R_{n,p}(\hat\theta^{LS})$ by means of its
singular value decomposition, i.e.

    $$ R_{n,p}(\hat\theta^{LS}) = U S V^T
       = [u_1 \, \cdots \, u_p] \, \mathrm{diag}\{s_1, \ldots, s_p\} \, [v_1 \, \cdots \, v_p]^T, $$   (18)

we find that the solution of (17) is $\hat X = s_1 u_1 v_1^T$. Then $\hat g = v_1 \, \mathrm{sign}(v_{1,1})$ (since
we have assumed $\|g\|_2 = 1$) and $\hat c = s_1 u_1 \, \mathrm{sign}(v_{1,1})$, where $v_{1,1}$ denotes the
first element of $v_1$.

Note that, since in general $\hat\theta^{LS}$ is not a KOP vector, generally $s_2, \ldots, s_p > 0$
and the truncation required by the approximation (17) introduces a bias in the
estimates $\hat g$ and $\hat c$ that degrades performance (see [14]). Another drawback of
this method is that it requires the least-squares estimate of the possibly
high-dimensional vector $\theta$. Hence, despite its consistency property, the procedure
can suffer from high variance in the estimates when $N$ is not (very) large.
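The two steps reviewed above (least squares followed by a rank-one truncation of the singular value decomposition) can be sketched as follows. This is an illustration under our own assumptions: the reshape is taken column-wise into a $p \times n$ matrix so that a KOP vector maps to $c g^T$, and the regressor matrix in the demo is a random stand-in rather than the Toeplitz matrix of (10).

```python
import numpy as np

def ls_op(Phi, y, n, p):
    """Least-squares overparameterized estimate with rank-one SVD truncation.

    Phi : N x (n*p) regressor matrix, y : N outputs.
    Returns (g_hat, c_hat, s), with s the singular values of the reshaped
    least-squares estimate.
    """
    theta_ls, *_ = np.linalg.lstsq(Phi, y, rcond=None)        # (16)
    R = theta_ls.reshape(p, n, order="F")                      # p x n (assumed orientation)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)           # (18)
    u1, v1 = U[:, 0], Vt[0, :]
    sgn = np.sign(v1[np.flatnonzero(v1)[0]])   # sign of the first non-zero entry
    g_hat = v1 * sgn                           # unit norm by construction
    c_hat = s[0] * u1 * sgn
    return g_hat, c_hat, s

rng = np.random.default_rng(0)
n, p, N = 4, 3, 60
g = np.array([0.5, 0.5, -0.5, 0.5])            # unit norm, first entry positive
c = np.array([1.0, -2.0, 0.5])
Phi = rng.standard_normal((N, n * p))          # stand-in regressors (not Toeplitz)
y = Phi @ np.kron(g, c)                        # noise-free data
g_hat, c_hat, s = ls_op(Phi, y, n, p)
```

With noise-free data the reshaped estimate is exactly rank-one and the trailing singular values vanish up to rounding. Adding noise to `y` makes them strictly positive, which is the source of the truncation error discussed above.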


4 A regularized overparameterization method for 
Hammerstein system identification 


In the previous section, we have seen that the least-squares overparameterized
estimator suffers from high variance and from a bias that degrades performance. To control
the variance of the estimate, we can use regularization; this means we have to
select some properties we want to enforce
on the estimated vector. As we have pointed out in the previous section, a
vector $\hat\theta$ is a good candidate estimate of the unknown vector $\theta$ if it satisfies the
following properties:

1. $\hat\theta$ is a KOP vector, so that it can be decomposed as in (12);

2. The mean square error of $\hat\theta$ is low, so that the estimated values $\hat g$ and $\hat c$
   are close to the true values.

A natural approach to incorporate (at least) the second property is based on
regularization or, equivalently, on the Gaussian regression framework [29]. Thus,
we model $\theta$ as a Gaussian random vector, namely

    $$ \theta \sim \mathcal{N}(0, H(\rho)), $$                            (19)

where the covariance matrix (also called a kernel) $H(\rho)$ is parameterized by the
vector $\rho$. The structure of $H(\rho)$ determines the properties of the realizations
from (19) and, consequently, of the estimates of $\theta$. In the next subsection, we
focus on designing a kernel suitable for Hammerstein system identification that
also incorporates the first property.


4.1 The KOP kernel 

We first recall the kernel-based identification approach for LTI systems proposed
in [20], [21], and we model also $g$ as a realization of a zero-mean $n$-dimensional
Gaussian process. Then we have

    $$ g \sim \mathcal{N}(0, K_\beta), $$                                 (20)

where the kernel $K_\beta$ corresponds to the so-called first-order stable spline kernel
(also known as the TC kernel). It is defined as

    $$ \{K_\beta\}_{i,j} = \beta^{\max(i,j)}, $$                          (21)

where the hyperparameter $\beta$ is a scalar in the interval $[0, 1)$. The choice of
this kernel is motivated by the fact that it promotes BIBO stable and smooth
realizations. The decay velocity of these realizations is regulated by $\beta$. Typical
formulations of the stable spline kernel include a scaling factor
multiplying the kernel, in order to capture the amplitude of the unknown
impulse response. Here such a hyperparameter is redundant, as we are working
under Assumption 1.

To reconcile (20) with (19), we need to ensure that the transformation $\theta = g \otimes c$ is
a Gaussian vector when $g$ is Gaussian. This is possible if $c$ is a (deterministic)
$p$-dimensional vector. In this case $\theta$ is an $np$-dimensional Gaussian random
vector with covariance matrix

    $$ H(\rho) = E[\theta \theta^T] = E[(g \otimes c)(g \otimes c)^T] = K_\beta \otimes c c^T, $$   (22)

which is parameterized by the vector $\rho = [\beta \;\; c^T]^T$. In this way, we have
defined a new kernel for system identification based on overparameterized vector
regression. We formalize this in the following definition.

Definition 2. We define the Kronecker overparameterized (KOP) kernel as

    $$ H(\rho) = K_\beta \otimes c c^T, \qquad \rho = [\beta \;\; c^T]^T, $$   (23)

where $K_\beta$ is as in (21).

Note that $H(\rho)$ is rank-deficient, its rank being equal to $n$. Rank-deficient
kernels for system identification have also been studied in the literature.
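A short NumPy sketch of the two kernels may help fix ideas. It builds the first-order stable spline kernel (21) and the KOP kernel (23), and checks numerically that the latter is symmetric, positive semi-definite and of rank $n$. Function names and numerical values are illustrative assumptions.

```python
import numpy as np

def stable_spline_kernel(n, beta):
    """First-order stable spline (TC) kernel: K[i, j] = beta**max(i, j), i, j = 1..n."""
    idx = np.arange(1, n + 1)
    return beta ** np.maximum.outer(idx, idx)

def kop_kernel(n, beta, c):
    """KOP kernel H = K_beta kron (c c^T), as in (23)."""
    c = np.asarray(c, dtype=float)
    return np.kron(stable_spline_kernel(n, beta), np.outer(c, c))

n, beta = 4, 0.5
c = np.array([1.0, -2.0, 0.5])
K = stable_spline_kernel(n, beta)
H = kop_kernel(n, beta, c)                 # (n*p) x (n*p) matrix
rank_H = np.linalg.matrix_rank(H)          # equals n for beta in (0, 1)
min_eig = np.linalg.eigvalsh(H).min()      # non-negative up to rounding
```

The rank deficiency comes entirely from the rank-one factor $c c^T$: every realization drawn from this prior is a Kronecker product of a Gaussian vector with the fixed vector $c$.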


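4.2 Estimation of the overparameterized vector $\theta$

We now derive the estimation procedure for the vector $\theta$. Recalling that the
noise distribution is Gaussian and given the Gaussian description of $\theta$ (19), the
joint distribution of $y$ and $\theta$ is Gaussian. Hence, we can write

    $$ p\left( \begin{bmatrix} y \\ \theta \end{bmatrix} ; \rho, \sigma^2 \right)
       \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
       \begin{bmatrix} \Sigma_y & \Sigma_{y\theta} \\ \Sigma_{\theta y} & H(\rho) \end{bmatrix} \right), $$   (24)

where $\Sigma_{y\theta} = \Phi H(\rho)$ and

    $$ \Sigma_y = \Phi H(\rho) \Phi^T + \sigma^2 I. $$                    (25)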


In (24) we have highlighted the dependence of the joint distribution on the
vector $\rho$ and the noise variance $\sigma^2$. Assume these quantities are given; then the
minimum mean square error estimate of $\theta$ can be computed as

    $$ \hat\theta = E[\theta \,|\, y; \rho, \sigma^2] = \Sigma_{\theta y} \Sigma_y^{-1} y. $$   (26)

To be able to compute (26) we first need to determine $\rho$ and $\sigma^2$. This can be
done by maximizing the ML of the output data. Then we have

    $$ \begin{aligned}
       \hat\rho, \hat\sigma^2 &= \arg\max_{\rho, \sigma^2} \; p(y; \rho, \sigma^2) \\
                              &= \arg\min_{\rho, \sigma^2} \; \log\det \Sigma_y + y^T \Sigma_y^{-1} y.
       \end{aligned} $$                                                   (27)


The resulting estimation procedure for $\theta$ can be summarized by the following
two steps.

1. Solve (27) to obtain $\hat\rho$, $\hat\sigma^2$.

2. Compute (26) using the estimated parameters.

Having obtained $\hat\theta$, it remains to establish how to decompose it in order to
obtain the estimates $\hat g$ and $\hat c$. In the next section we shall see that, using the
proposed approach, such an operation becomes natural.
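The two ingredients of the procedure, namely the cost in (27) and the estimate (26), can be sketched as below for given hyperparameters. The outer optimization over the kernel parameters and the noise variance is left to a generic numerical optimizer and is not shown. As a sanity check, and only as an assumption of this sketch, we use a full-rank stand-in kernel so that (26) can be compared with its regularized least-squares form.

```python
import numpy as np

def neg_log_marginal_likelihood(y, Phi, H, sigma2):
    """Cost minimized in (27): log det(Sigma_y) + y^T Sigma_y^{-1} y."""
    Sigma_y = Phi @ H @ Phi.T + sigma2 * np.eye(len(y))       # (25)
    _, logdet = np.linalg.slogdet(Sigma_y)
    return logdet + y @ np.linalg.solve(Sigma_y, y)

def posterior_mean(y, Phi, H, sigma2):
    """MMSE estimate (26): theta_hat = H Phi^T Sigma_y^{-1} y."""
    Sigma_y = Phi @ H @ Phi.T + sigma2 * np.eye(len(y))
    return H @ Phi.T @ np.linalg.solve(Sigma_y, y)

rng = np.random.default_rng(1)
N, d = 20, 6
Phi = rng.standard_normal((N, d))      # stand-in regressor matrix
H = 0.5 * np.eye(d)                    # full-rank stand-in kernel (not the KOP kernel)
y = rng.standard_normal(N)
sigma2 = 0.1

cost = neg_log_marginal_likelihood(y, Phi, H, sigma2)
theta_hat = posterior_mean(y, Phi, H, sigma2)
# Regularized least-squares form of the same estimate, valid when H is invertible.
theta_rls = np.linalg.solve(Phi.T @ Phi + sigma2 * np.linalg.inv(H), Phi.T @ y)
```

The form used in `posterior_mean` never inverts the kernel, which is what makes it applicable to the rank-deficient KOP kernel.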


5 Properties of the estimated overparameterized 
vector 

In this section, we analyze some properties of the regularized
overparameterization estimate of $\theta$. In particular, we show that the estimates produced by (26)
are KOP vectors. Then, we show that this procedure leads to exactly the same
estimator as the one we proposed in [25], where the coefficients of the
nonlinearity were considered as model parameters and not included among the kernel
hyperparameters.

To further specify the equivalence, we first briefly review the Hammerstein
system identification approach proposed in [25], which is based on a different
Gaussian process assumption.

5.1 A review of the method proposed in [25]

Let $W = T_n(w) = T_n(Fc)$. Then we can model the measurements with the
linear relation

    $$ y = W g + e. $$                                                    (28)

Modeling $g$ as a Gaussian random vector with covariance given by the stable
spline kernel (21), we notice that a joint Gaussian description holds between $y$
and $g$. Hence we can write

    $$ p\left( \begin{bmatrix} y \\ g \end{bmatrix} ; c, \beta, \sigma^2 \right)
       \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
       \begin{bmatrix} \Sigma_{y,2} & \Sigma_{yg} \\ \Sigma_{gy} & K_\beta \end{bmatrix} \right), $$   (29)

where $\Sigma_{yg} = W K_\beta$ and $\Sigma_{y,2} = W K_\beta W^T + \sigma^2 I$. Note that (29) depends on the
parameters $c$, $\beta$ and $\sigma^2$. These parameters are estimated via ML maximization,
that is by solving

    $$ \hat c, \hat\beta, \hat\sigma^2 = \arg\min_{c, \beta, \sigma^2} \; \log\det \Sigma_{y,2} + y^T \Sigma_{y,2}^{-1} y. $$   (30)

The minimum mean square estimate of $g$ is then computed as

    $$ \hat g = E[g \,|\, y; \hat c, \hat\beta, \hat\sigma^2] = K_\beta W^T \Sigma_{y,2}^{-1} y. $$   (31)

In the next section, we point out the strong connection between (31) and the
estimate (26), produced by the KOP kernel-based regression approach.


5.2 The estimate (26) is a KOP vector 


In this section we prove that, when the KOP kernel-based method is used to
compute the estimate (26), the resulting estimates can be decomposed as Kronecker products
of lower-dimensional vectors and thus they are KOP vectors. Before arriving
at this result we show the equivalence between the output measurement
models (11) and (28).


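Lemma 3. Let $W = T_n(Fc)$ and $\Phi$ be as in (10). Then

    $$ \Phi H(\rho) \Phi^T = W K_\beta W^T, $$                            (32)

where $H(\rho)$ and $K_\beta$ are the KOP and the stable spline kernels.

Proof. Recalling the bilinear property of the Kronecker product and Lemma 1,
we see that

    $$ \begin{aligned}
       \Phi H(\rho) \Phi^T
         &= P [I \otimes F] \, H(\rho) \, [I \otimes F^T] P^T \\
         &= P [I \otimes F] [K_\beta \otimes c c^T] [I \otimes F^T] P^T \\
         &= P [I \otimes F] [I \otimes c] [K_\beta \otimes 1] [I \otimes c^T] [I \otimes F^T] P^T \\
         &= P [I \otimes Fc] [K_\beta \otimes 1] [I \otimes c^T F^T] P^T \\
         &= T_n(Fc) \, K_\beta \, T_n(Fc)^T = W K_\beta W^T,
       \end{aligned} $$

which proves the result. □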
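Lemma 3 can be verified numerically on random data. In the sketch below the Toeplitz operator is implemented through the identity $T_n(A) = [A \;\; SA \;\; \cdots \;\; S^{n-1}A]$, which follows from (8); the matrix `F` is a random stand-in for the basis-function matrix (9), and all names and values are illustrative.

```python
import numpy as np

def toeplitz_T(A, n):
    """T_n(A) = P (I_n kron A) = [A, S A, ..., S^{n-1} A], see (8)."""
    A = np.asarray(A, dtype=float)
    if A.ndim == 1:
        A = A[:, None]
    N = A.shape[0]
    blocks = []
    for k in range(n):
        shifted = np.zeros_like(A)
        if k < N:
            shifted[k:] = A[: N - k]     # rows shifted down by k, zero-filled
        blocks.append(shifted)
    return np.hstack(blocks)

def stable_spline_kernel(n, beta):
    """K[i, j] = beta**max(i, j), i, j = 1..n, as in (21)."""
    idx = np.arange(1, n + 1)
    return beta ** np.maximum.outer(idx, idx)

rng = np.random.default_rng(2)
N, n, p, beta = 12, 4, 3, 0.7
F = rng.standard_normal((N, p))          # stand-in for the matrix in (9)
c = rng.standard_normal(p)
K = stable_spline_kernel(n, beta)
Phi = toeplitz_T(F, n)                   # N x np, as in (10)
W = toeplitz_T(F @ c, n)                 # N x n
H = np.kron(K, np.outer(c, c))           # KOP kernel (23)
lemma3_holds = np.allclose(Phi @ H @ Phi.T, W @ K @ W.T)
```

The key step is that `Phi @ np.kron(np.eye(n), c[:, None])` equals `W`, which is the numerical counterpart of the fourth line of the proof.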


Theorem 1. Consider the output measurement models (11) and (28). Then:

1. The marginal likelihoods of $y$ obtained from the two models are equivalent;

2. The parameter estimates obtained from (27) and (30) are the same.

Proof. Let

    $$ p_1(y; \rho, \sigma^2) = \int p(y, \theta; \rho, \sigma^2) \, d\theta $$      (33)

    $$ p_2(y; c, \beta, \sigma^2) = \int p(y, g; c, \beta, \sigma^2) \, dg $$        (34)

be the marginal likelihoods of the two models. We find that

    $$ p_1(y; \rho, \sigma^2) = \mathcal{N}(0, \Sigma_y), \qquad
       p_2(y; c, \beta, \sigma^2) = \mathcal{N}(0, \Sigma_{y,2}). $$

Using Lemma 3 we have that $\Sigma_y = \Sigma_{y,2}$, hence $p_1$ and $p_2$ are equivalent. The
same promptly holds for their ML maximizers $\hat\rho = [\hat\beta \;\; \hat c^T]^T$ and $\hat\sigma^2$. □

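We are now in the position to prove that the estimate $\hat\theta$ is a KOP vector.

Theorem 2. Assume that $\rho$ and $\sigma^2$ are estimated using the ML approach (27)
(or, equivalently, (30)). Then, the minimum variance estimate of $\theta$ in (26) is
such that

    $$ \hat\theta = \hat g \otimes \hat c, $$                             (35)

where $\hat g$ is the minimum variance estimate of $g$ in (31) and $\hat c$ is the ML estimate
of $c$.

Proof. Using (26) and recalling the bilinear property of the Kronecker product
and Lemma 1, we have

    $$ \begin{aligned}
       \hat\theta &= \Sigma_{\theta y} \Sigma_y^{-1} y = H(\hat\rho) \Phi^T \Sigma_y^{-1} y \\
                  &= [K_{\hat\beta} \otimes \hat c] \, W^T \Sigma_y^{-1} y \\
                  &= [K_{\hat\beta} \otimes \hat c] [W^T \Sigma_y^{-1} y \otimes 1] \\
                  &= [K_{\hat\beta} W^T \Sigma_y^{-1} y \otimes \hat c].
       \end{aligned} $$                                                   (36)

From Theorem 1 we know that $\Sigma_y = \Sigma_{y,2}$; thus, recalling (31) we have

    $$ K_{\hat\beta} W^T \Sigma_y^{-1} y = \hat g, $$                     (37)

so that (35) is obtained. □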


Corollary 1. The estimate $\hat\theta$ given in (26) is a KOP vector and $R_{n,p}(\hat\theta)$ is
rank-one.

Proof. Since from (35) we have $\hat\theta = \hat g \otimes \hat c$, $\hat\theta$ is a KOP vector. The second part
of the statement follows directly from Lemma 2. □
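The factorization (35) is an algebraic consequence of the kernel structure, so it can be observed for any fixed hyperparameters. The sketch below uses arbitrary values of $\beta$, $c$ and $\sigma^2$ (not ML estimates) and random stand-in data; the column-wise reshape into a $p \times n$ matrix is our assumption on the orientation of the reshape operator.

```python
import numpy as np

def toeplitz_T(A, n):
    """T_n(A) = [A, S A, ..., S^{n-1} A], with S the down-shift matrix."""
    A = np.asarray(A, dtype=float)
    if A.ndim == 1:
        A = A[:, None]
    N = A.shape[0]
    blocks = []
    for k in range(n):
        shifted = np.zeros_like(A)
        if k < N:
            shifted[k:] = A[: N - k]
        blocks.append(shifted)
    return np.hstack(blocks)

rng = np.random.default_rng(3)
N, n, p, beta, sigma2 = 15, 4, 3, 0.6, 0.1
F = rng.standard_normal((N, p))            # stand-in for the matrix in (9)
c = np.array([1.0, -0.5, 2.0])             # fixed (not estimated) coefficients
y = rng.standard_normal(N)                 # stand-in output data

idx = np.arange(1, n + 1)
K = beta ** np.maximum.outer(idx, idx)     # stable spline kernel (21)
H = np.kron(K, np.outer(c, c))             # KOP kernel (23)
Phi = toeplitz_T(F, n)
W = toeplitz_T(F @ c, n)

Sigma_y = Phi @ H @ Phi.T + sigma2 * np.eye(N)
alpha = np.linalg.solve(Sigma_y, y)
theta_hat = H @ Phi.T @ alpha              # estimate (26)
g_hat = K @ W.T @ alpha                    # estimate (31), using Sigma_y = Sigma_{y,2}
R = theta_hat.reshape(p, n, order="F")     # equals c g_hat^T up to rounding
```

Here `theta_hat` coincides with `np.kron(g_hat, c)` and `R` has numerical rank one, so no truncation of singular values is needed to decompose the estimate.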
Figure 2: Results of the Monte Carlo experiments for different SNR. Top: Fit
in percent of the linear system impulse response. Bottom: Fit in percent of the
nonlinear transformation.


We can make an interesting observation that further links the KOP estimate
to our previous kernel-based estimator:

Corollary 2. The estimates of the nonlinearity coefficients $\hat c$ found by
maximizing (27) and those resulting from decomposing $\hat\theta$ as in (35) are the same.

Proof. Follows directly from (36) and the theorems above. □

We have established that the estimate $\hat\theta$ produced using the procedure
detailed in Section 4.2 is a KOP vector. So, the estimates of the impulse response
$\hat g$ and the nonlinearity coefficients $\hat c$ can be retrieved using (13). The whole
procedure is summarized in Algorithm 1.


Algorithm 1: KOP kernel-based Hammerstein system identification
Input: input-output samples $\{u_t\}$, $\{y_t\}$
Output: $\{\hat g_t\}_{t=1}^{n}$, $\{\hat c_i\}_{i=1}^{p}$

1. Obtain $\hat\rho$, $\hat\sigma^2$ solving (27)

2. Estimate $\hat\theta$ using (26)

3. Find $\hat g$ and $\hat c$ by decomposing $\hat\theta$ as in (13) (Proposition 1).

The outlined regularization procedure, applied to the overparameterized
vector with a suitable rank-deficient kernel, yields estimates that
are equivalent to the ones provided by the procedure outlined in [25].


6 Numerical Experiments 


We evaluate the proposed algorithm with numerical simulations of Hammerstein
systems. We set up 4 experiments in order to test different experimental
conditions. The experiments consist of 200 independent Monte Carlo runs each.
At every Monte Carlo run, we generate random data and Hammerstein systems,
according to the following specifications.


• The linear subsystem model is of output error type:

      $$ y(t) = \frac{B(q)}{F(q)} w(t) + e(t), $$                         (38)

  generated by picking 4 poles and 4 zeros at random. The poles and zeros
  were sampled in conjugate pairs ($a e^{\pm j\omega}$) with $a$ uniform in $[0.5, 0.95]$
  and $\omega$ uniform in $[0, \pi]$.

• The input nonlinearity is a polynomial of fourth order. It is a linear
  combination of Legendre polynomial basis functions, defined as

      $$ P_j(u) = \frac{1}{2^j j!} \frac{d^j}{du^j} \left[ (u^2 - 1)^j \right], $$   (39)

  where $j = 0, \ldots, 4$. The coefficients $c$ are chosen uniformly in $[-1, 1]$.

• The input to the system is Gaussian white noise with unit variance. 

• The experimental data consist of $N = 1000$ pairs of input-output samples,
  simulated from zero initial conditions.

• The measurement noise $e(t)$ is Gaussian and white. Its variance is a
  fraction of the noiseless output variance, i.e.

      $$ \sigma^2 = \frac{\mathrm{Var}(Wg)}{\mathrm{SNR}}, $$             (40)

  where SNR depends on the experiment.
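Part of the data generation described in the list above can be sketched with NumPy, whose `numpy.polynomial.legendre.legvander` function evaluates the Legendre polynomials $P_0, \ldots, P_4$ of (39). The impulse response used below is a fixed stand-in, not a randomly generated output error system, and the helper names are our own.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_regressors(u, order=4):
    """Matrix with columns P_0(u), ..., P_order(u), one row per input sample."""
    return legendre.legvander(np.asarray(u, dtype=float), order)

def noise_variance(noiseless_output, snr):
    """Noise variance from (40): Var(noiseless output) / SNR."""
    return np.var(noiseless_output) / snr

rng = np.random.default_rng(4)
N = 1000
u = rng.standard_normal(N)                  # Gaussian white input, unit variance
c = rng.uniform(-1.0, 1.0, size=5)          # coefficients uniform in [-1, 1]
F = legendre_regressors(u)                  # N x 5 basis-function matrix
w = F @ c                                   # output of the static nonlinearity

g = np.array([0.8, 0.5, 0.3, 0.1])          # stand-in impulse response (assumption)
g = g / np.linalg.norm(g)                   # unit l2 norm, first sample positive
noiseless = np.convolve(w, g)[:N]           # zero initial conditions
sigma2 = noise_variance(noiseless, snr=10)
y = noiseless + np.sqrt(sigma2) * rng.standard_normal(N)
```

Changing the `snr` argument reproduces the different noise levels considered across the experiments.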


Every experiment is carried out in a different signal to noise ratio (SNR)
condition, see Table 1.

We aim at estimating $n = 30$ samples of the system impulse response of the
LTI systems (which are such that $\|g\|_2 = 1$ and with the sign of the first sample
positive) and the $p = 5$ coefficients of the nonlinear block. We test the following
estimation methods.

    Experiment    1    2    3    4
    SNR          10   20   50  100

Table 1: SNR considered in the experiments


• KOP: This is the method described in this paper. The ML optimization
  problem is solved using the function fminsearch available in MATLAB.
  The search was initialized with the elements of $c$ uniformly sampled in
  $[0, 1]$, $\beta_0 = 0.5$, and $\sigma^2$ equal to the sample variance of the residuals of

• LS-OP: This estimator implements the least-squares overparameterization-based
  method proposed by Bai and briefly reviewed in Section 3.1. Note
  that, under the working experimental conditions, this method has to
  perform a least-squares estimate of a 150-dimensional vector.

• NLHW: This is the MATLAB function nlhw that uses the prediction error
  method to identify the linear block in the system (see [17] for details). To
  get the best performance from this method, we equip it with an oracle
  that knows the true order of the LTI system generating the measurements
  and knows the order of the polynomial input nonlinearity.

Note that all the methods have available the same amount of prior information,
namely the order of the input polynomial. The order of the
linear block is known only to NLHW, which makes use of a parametric
description of the linear system. Furthermore, we note that, due to the Gaussianity of
the noise, the least-squares procedure in LS-OP is optimal in the Gauss-Markov
sense.

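We assess the accuracy of the estimated models using two performance
indices. The first is the fit of the system impulse response, defined as

    $$ \mathrm{FIT}_{g,i} = 100 \left( 1 - \frac{\| g_i - \hat g_i \|_2}{\| g_i - \bar g_i \|_2} \right), $$   (41)

where $g_i$ is the system generated at the $i$th run of each experiment, $\hat g_i$ its
estimate and $\bar g_i$ its mean. The second is the fit of the static nonlinear function,
given by

    $$ \mathrm{FIT}_{f,i} = 100 \left( 1 - \frac{\| f_i - \hat f_i \|_2}{\| f_i - \bar f_i \|_2} \right). $$   (42)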
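The fit index can be computed with a small helper such as the one below, which is a sketch with a function name of our choosing. It applies equally to impulse responses and to samples of the nonlinear function.

```python
import numpy as np

def fit_percent(true, estimate):
    """FIT = 100 * (1 - ||true - estimate|| / ||true - mean(true)||)."""
    true = np.asarray(true, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return 100.0 * (1.0 - np.linalg.norm(true - estimate)
                    / np.linalg.norm(true - true.mean()))

g_true = np.array([1.0, 2.0, 3.0])
fit_perfect = fit_percent(g_true, g_true)                  # exact estimate
fit_mean = fit_percent(g_true, np.full(3, g_true.mean()))  # constant estimate at the mean
```

A perfect estimate yields a fit of 100, an estimate no better than the sample mean yields 0, and poorer estimates give negative values.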

Figure 2 shows the outcomes of the 4 experiments. The box plots
compare the results of KOP, LS-OP and NLHW for the considered signal to
noise ratios. We can see that, for high SNR, all the estimators perform well,
especially in identifying the nonlinearity coefficients. For lower SNR, however,
the proposed method KOP performs substantially better than the others. This
is mainly because of the regularizing effect of the KOP kernel that reduces
the variance of the estimates. Notice also that the proposed approach enforces
the rank deficiency in the reshaped version of $\hat\theta$, so it circumvents the errors
introduced by the rank-one approximation made by LS-OP. The main drawback
of NLHW is that it relies on a high-dimensional nonlinear optimization, as it
needs to estimate all the parameters in the model. The proposed method is
instead nonparametric, and does not rely on the knowledge of the order of the
LTI system.

7 Conclusions 

Regularization is an effective technique to control the variance of least squares
estimates. In this paper we have studied how to improve popular
overparameterization methods for Hammerstein system identification using Gaussian process
regression with a suitable prior. To this end, starting from the stable spline
kernel, we have introduced the KOP kernel, which we believe to be a novel
concept in Hammerstein system identification. Using the KOP kernel, we have
designed a regularized least-squares estimator which provides an estimate of the
overparameterized vector. The impulse response of the LTI system and the
coefficients of the static nonlinearity are then retrieved by suitably decomposing the
estimated vector. In contrast with the original overparameterization method,
this decomposition involves no approximation. An important contribution is
showing that the resulting estimate is equivalent to the one given by our recently proposed
kernel-based method [25]. Using simulations, we have shown that the proposed
method compares very favorably with the current state-of-the-art algorithms for
Hammerstein system identification.

The introduction of the KOP kernel may open the way to new effective
system identification methods based on the combination of overparameterized
vectors and regularization techniques. In fact, we believe that Hammerstein
system identification is not the only problem where KOP kernels could find
application. Another possible extension of the proposed method is the design
of new kernels merging a kernel for the static nonlinearity and the stable spline
kernel. The main issue with this approach is that, at least theoretically, the
Gaussian description of the resulting overparameterization vector would be lost.